Interfacing Speech Recognition and Vision Guided
Abstract
One goal of a pervasive computing environment is to allow the user to interact with the environment in an easy and natural manner. Spoken commands, used as inputs to a speech recognition system, are one such natural way to interact with the environment. In challenging acoustic environments, microphone arrays can improve the quality of the input audio signal by beamforming, or steering, to the location of the speaker of interest. The presence of multiple speakers, large interfering signals, and/or reverberations or reflections in the audio signal(s) requires advanced beamforming techniques that attempt to separate the target audio from the mixed signal received at the microphone array. In this thesis I present and evaluate a method of modeling reverberations as separate anechoic interfering sources emanating from fixed locations. This acoustic modeling technique allows for tracking of acoustic changes in the environment, such as those caused by speaker motion.
Thesis Supervisor: Trevor Darrell
Title: Associate Professor
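The steering idea mentioned in the abstract can be illustrated with the simplest beamformer, delay-and-sum. The sketch below is not the thesis's method (which models reverberations as additional anechoic sources); it only shows how known arrival delays let an array align and average its channels. The function names, the speed-of-sound constant, and the FFT-based (circular) time shift are assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_positions, source_position, c=SPEED_OF_SOUND):
    """Relative arrival delay (seconds) at each microphone for a point source."""
    distances = np.linalg.norm(mic_positions - source_position, axis=1)
    return (distances - distances.min()) / c


def delay_and_sum(signals, delays, sample_rate):
    """Undo each channel's arrival delay, then average the aligned channels.

    signals: array of shape (num_mics, num_samples)
    delays:  per-microphone arrival delay in seconds
    Note: the shift is applied in the frequency domain, so it is circular.
    """
    num_mics, num_samples = signals.shape
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / sample_rate)
    spectra = np.fft.rfft(signals, axis=1)
    # Advancing a channel by tau multiplies its spectrum by exp(+j*2*pi*f*tau).
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = np.fft.irfft(spectra * phase, n=num_samples, axis=1)
    return aligned.mean(axis=0)
```

Signals from the steered location add coherently after alignment, while sound arriving from other directions is misaligned and partially averages out. That limited suppression is why the abstract calls for more advanced techniques when interferers or reverberation are strong.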
Similar resources
Interfacing Sound Stream Segregation to Recognition - Preliminar Several Sounds Si
This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, an...
Classification of the Spoken Hindi Partially Reduplicated Words using Artificial Neural Network
The most common way of exchanging information is speech. Speech interfacing provides an efficient means of man-machine communication. It involves two processes: speech synthesis and speech recognition. Speech recognition allows a computer to identify the words that a person speaks into a microphone or telephone. The two main mechanisms used in speech recognition are signal p...
A new speech enhancement: speech stream segregation
Speech stream segregation is presented as a new speech enhancement for automatic speech recognition. Two issues are addressed: speech stream segregation from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition. Speech stream segregation is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, and substitu...
Interfacing acoustic models with natural language processing systems
The research presented here focuses on implementation and efficiency issues associated with the use of word graphs for interfacing acoustic speech recognition systems with natural language processing systems. The effectiveness of various pruning methods for graph construction is examined, as well as techniques for word graph compression. In addition, the word graph representation is compared to...
A Generic and Visual Interfacing Framework for Bridging the Interface between Application Systems and Recognizers
Application systems that utilize recognition technologies such as speech, gesture, and color recognition provide human-machine interfacing to users who are physically unable to interact with computers through traditional input devices such as a mouse or keyboard. Current solutions for interfacing application systems with recognizers, however, use an ad hoc approach and lack a generic and s...
Publication date: 2014